A 3D-1D Substitution Matrix for Protein

Authors

  • Danny W. Rice
  • David Eisenberg
Abstract

In protein fold recognition, a probe amino acid sequence is compared to a library of representative folds of known structure to identify a structural homolog. In cases where the probe and its homolog have clear sequence similarity, traditional residue substitution matrices have been used to predict the structural similarity. For cases where the probe is sequentially distant from its homolog, we have developed a (7 × 3 × 2 × 7 × 3) 3D-1D substitution matrix (called H3P2), calculated from a database of 119 structural pairs. Members of each pair share a similar fold but have sequence identity of less than 30%. Each probe sequence position is defined by one of 7 residue classes and 3 secondary structure classes. Each homologous fold position is defined by one of 7 residue classes, 3 secondary structure classes, and 2 burial classes. Thus the matrix is 5-dimensional and contains 7 × 3 × 2 × 7 × 3 = 882 elements, or 3D-1D scores. The first step in assigning a probe sequence to its homologous fold is the prediction of the 3-state (helix, strand, coil) secondary structure of the probe; here we use the PHD program. Then a dynamic programming algorithm uses the H3P2 matrix to align the probe sequence with structures in a representative fold library. To test the effectiveness of the H3P2 matrix, a challenging, fold-class-diverse, and cross-validated benchmark assessment is used to compare the H3P2 matrix to the GONNET, PAM250, and BLOSUM62 matrices and to a secondary-structure-only substitution matrix. For distantly related sequences, the H3P2 matrix detects more homologous structures at higher reliabilities than do these other substitution matrices, based on sensitivity versus specificity plots (SENS-SPEC plots). The added efficacy of the H3P2 matrix arises from its information on the statistical preferences for various sequence-structure environment combinations from very distantly related proteins. It introduces the predicted secondary structure information from a sequence into fold recognition in a statistical way that normalizes the inherent correlations between residue type, secondary structure and solvent accessibility.
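
As a rough illustration of how a 5-dimensional score table of this shape could drive a dynamic programming alignment, the following Python sketch may help. It is not the authors' implementation. The axis order (fold residue class, fold secondary structure, fold burial, probe residue class, probe predicted secondary structure), the global Needleman-Wunsch recursion with a linear gap penalty, the gap value, and every score in `toy_scores` are assumptions made for the example; the abstract specifies only the class counts and that a dynamic programming algorithm is used. The names `SHAPE`, `flat_index`, `align_score`, and `toy_scores` are hypothetical.

```python
from itertools import product

# Class counts come from the abstract; the axis ORDER is an assumption:
# (fold residue, fold SS, fold burial, probe residue, probe predicted SS)
SHAPE = (7, 3, 2, 7, 3)


def n_elements(shape=SHAPE):
    """Number of 3D-1D scores in a table of the given shape."""
    n = 1
    for d in shape:
        n *= d
    return n


def flat_index(key, shape=SHAPE):
    """Row-major offset of a 5-tuple of class indices into a flat score list."""
    idx = 0
    for k, d in zip(key, shape):
        if not 0 <= k < d:
            raise IndexError(f"class index {k} out of range for axis of size {d}")
        idx = idx * d + k
    return idx


def align_score(fold, probe, scores, gap=-1.0):
    """Best global alignment score (Needleman-Wunsch, linear gap penalty).

    fold  : list of (residue_class, ss_class, burial_class) tuples
    probe : list of (residue_class, predicted_ss_class) tuples
    scores: flat list of length n_elements(), indexed via flat_index
    The gap penalty is an illustrative value, not one from the paper.
    """
    n, m = len(fold), len(probe)
    prev = [j * gap for j in range(m + 1)]       # row for zero fold positions
    for i in range(1, n + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            s = scores[flat_index(fold[i - 1] + probe[j - 1])]
            cur[j] = max(prev[j - 1] + s,        # align fold i with probe j
                         prev[j] + gap,          # fold position unmatched
                         cur[j - 1] + gap)       # probe position unmatched
        prev = cur
    return prev[m]


# Toy table (NOT the H3P2 values): reward agreement of residue and SS class.
toy_scores = [0.0] * n_elements()
for key in product(*(range(d) for d in SHAPE)):
    fold_res, fold_ss, _burial, probe_res, probe_ss = key
    agree = fold_res == probe_res and fold_ss == probe_ss
    toy_scores[flat_index(key)] = 1.0 if agree else -0.5
```

For instance, `align_score([(0, 0, 0), (1, 1, 1), (2, 2, 0)], [(0, 0), (1, 1), (2, 2)], toy_scores)` aligns a three-position toy fold against a three-position toy probe. In a real fold recognition run, each structure in the fold library would be scored this way and the results ranked.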




Publication date: 1997